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ABSTRACT 


This  report  presents  the  concept  of  square  root  filters 
and  the  closely  related  U-D  covariance  factorization  filter  as 
viable  alternatives  to  conventional  Kalman  filters.  For  a 
modest  increase  in  computational  loading,  one  obtains  optimal 
filter  algorithms  equivalent  to  the  Kalman  filter  if  infinite 
wordlength  is  assumed,  but  with  vastly  superior  numerical 
characteristics  with  finite  wordlength.  These  algorithms  are 
at  least  as  good  a solution  to  troublesome  measurement  update 
computations  as  implementing  a Kalman  filter  in  double  precision, 
since  the  Kalman  filter  inherently  involves  unstable  numerics. 

The  filter  algorithms  are  developed  and  presented  in  a form 
convenient  for  implementation.  Of  the  covariance  square  root 
forms,  the  Carlson  filter  is  more  tfficient  than  the  Potter 
form  computationally,  and  it  also  maintains  triangularity  of 
the  square  root  matrices.  The  U-D  covariance  factorization 
filter  is  even  more  efficient,  not  requiring  square  root  compu- 
tations. In  comparison,  the  inverse  covariance  square  root 
filter  is  often  considerably  more  burdensome,  although  it  too 
becomes  competitive  if  the  measurement  dimension  is  very  large. 


SOLUTIONS  TO  THE  KALMAN  FILTER  WORDLENGTN  PROBLEJI; 

SQUARE  ROOT  AND  U-*)  COVARIANCE  FACTORIZATIONS 

1.  Introduction;  The  Kalman  Filter 

The  Kalman  filter  Is  the  optimal  state  estimator  for  a system 
described  by  a linear  model  driven  by  white,  Gaussian  noise. 

On  the  basis  of  such  a model  and  lncoiq>lete,  noise-corrupted  data  from 
available  sensors.  It  provides  estimates  of  the  system  state  variables 
that  are  optimal  with  respect  to  many  different  criteria  for  evaluating 
filter  performance. 

Assume  that  modelling  techniques  have  produced  an  adequate 
system  description  in  the  form  of  a linear  stochastic  differential 
equation  to  describe  the  state  propagation,  with  discrete-time  noise- 
corrupted  linear  measurements  available  as  system  outputs.  To 
describe  this  model,  let  a stochastic,  process  be  defined  as  a real 
vector-valued  function  of  two  arguments,  the  first  of  which  is  an 

element  of  some  time  set  T of  interest  (continuous  or  discrete),  and 
the  second  an  element  u of  a probability  sample  space  il,  such  that 
for  any  fixed  t out  of  the  set  T,  x(t,*)  is  a vector-valued 
random  variable.  If  we  fix  the  second  argument  Instead  of  the  first,  we 
can  say  that,  to  each  point  out  of  H there  can  be  associated  a 

time  function  x(*,u)j^)  ■ ®ach  of  which  Is  a sample  from  the  ' 

stochastic  process.  Sometimes  the  second  argument  Is  not  Included 
explicitly  In  the  notation,  so  the  underscored  tilde  Is  used  to 
distinguish  between  a process  a^(')  and  one  of  Its  samples  a(*). 

'U 

For  Instance,  one  can  develop  a stochastic  process  model  ^(*,*)  to 


describe  the  oieasureinents  becoming  available  over  time  from 


sensors , 


while  a single  sample  ^(' = 5.^’)  would  correspond  to  a single 
measurement  time  history,  and  a(t^,Uj^)  = z(t^)  » ^ would  be  the  vector 
of  sensor  outputs  on  a given  time  history  at  the  particular  time  t^^. 

Let  the  state  process  x(‘>')  of  the  system  model  satisfy  the 
linear  equation 

x(t)  = F(t)x(t)  + B(t)u(t)  + G(t)  w(t)  (1) 


where 


xt* » • )”n-vector  state  process 
-A 

_u(*)  ■ r-vector  of  piecewise  continuous  deterministic  control 

input  functions  (more  general  input  functions  are  possible, 
but  piecewise  continuous  is  adequate  for  practical  applications) 
wC»*)  = s -vector  dynamic  noise  process,  modelled  as  white  and 

'X, 

Gaussian,  with  mean  and  autocorrelation  for  all  t and  t' 
out  of  T given  by 


E {w(t)}  = 0 


E {w(t)w^(t')}  = ^(t)  6 (t-t') 


where  2.(*)  is  an  s-by-s  matrix  of  piecewise  continuous 
functions  (most  generally)  such  that  2^(t)  is  symmetric  and 
positive  semideflnite  for  all  t (i.  e.  all  eigenvalues  positive 
or  zero),  and  6(")  is  the  Dirac  delta  function 


F(*)“  n-by-n  system  dynamics  matrix  (of  piecewise  continuous 
functions  In  Its  general  form 


^(•)"  n-by-r  deterministic  Input  matrix 


G(‘)“  n-by-s  dynamic  noise  Input  matrix 


Note  that  If  F,  and  G are  In  fact  constant  matrices,  then  the 
system  dynamics  model  Is  termed  time-invariant,  rather  than  tlne-varylng . 
The  state  differential  equation  (1)  Is  propagated  forward 


from  the  Initial  condition  3t(t  ).  For  any  particular  operation  of 


the  real  system,  the  Initial  state  assumes  a specific  value  x(t^) . 


However,  because  this  value  may  not  be  known  precisely  aprlorl.  It 
Is  modelled  as  a Gaussian  random  vector.  Thus,  the  probabilistic 


description  of  x(t^)  is  coiiq>letely  specified  by  its  mean  x^  and 


covariance  P : 


E {x(t^)}  - x^ 


(3a) 


f\,  % 


(3b) 


where  Is  an  n-by-n  matrix  that  Is  symmetric  and  positive  semi definite, 


Allowing  ^ to  be  pcaltive  semldelinite , Instead  of  requiring  positive 
definiteness  (i.e.  all  eigenvalues  strictly  positive),  admits  the 
case  of  singular  the  case  In  which  some  Initial  states  or  some 

linear  coitblnatlon  of  initial  states  are  known  precisely. 


1 


O 


Measuremetito  are  available  an  discrete  tiiae  points  tj^,  tj,..., 
(often,  but  not  necessarily,  equally  space 1 in  titte) , 
and  are  modelled  by  the  relation  (for  all  t^  out  t»£  T ) : 


2(t^)  = li(t^)  x(tj^)  + v(t^) 

% 'V,  % 


where 


£('.•)  “ m-vector  discrete-time  measurement  process 


x('«')  = state  process;  x(t. ,*)  is  a random  vector 

t\j  'V 

corresponding  to  the  state  stochastic  process  at  the 


particular  sample  time  t^ 

v(*,*)  = m-vector  discrete-time  measurement  corruption  noise 


process,  modelled  as  white  and  Gaussian,  with 
statistics  (for  all  t^,  t^  taken  from  T): 


E {v(t^)  } = ^ 


E {v(t^)v^(t  )} 


R(ti)  t^  - tj 


'i’‘  h 


where  ^(t^)  is  an  m-by-m,  symmetric,  positive  definite 
matrix  for  all  t^ 

/ xA 

H(»)“  m-by-n  measurement  matrix 

In  this  description,  positive  definiteness  of  R(t^)  implies  that 
all  components  of  the  measurement  vector  are  corrupted  by  noise. 
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and  there  Is  no  linear  combination  of  these  coiiq>onents  that  would 
be  noise-free.  The  measurements  modelled  as  In  equation  (4), 
and  knowledge  of  our  own  Inputs  u,are  all  that  we  have  available 
from  the  real  system  under  consideration. 

It  Is  further  assumed  that  x(t  ),  w(*,*)  and  v(*,*)  are 

- — O ~ ” 

Independent  of  each  other.  Since  all  are  assumed  Gaussian,  this 
Is  equivalent  to  assuming  that  they  are  uncorrelated  with  each  other. 

It  Is  desired  to  combine  the  measurement  data  from  the  actual 

system  with  the  Information  provided  by  the  system  model  and 

statistical  description  of  uncertainties.  In  order  to  obtain  an 

"optimal"  estimate  of  the  system  state.  Under  the  modelling 

assumptions  made,  the  Kalman  filter  provides  state  estimates  that 

are  optimal  with  respect  to  essentially  all  criteria  used  to 

evaluate  estimators:  Bayesian,  maximum  likelihood,  weighted 

least  squares,  minimum  variance  unbiased,  and  maximum  a posteriori, 

to  name  the  most  common.  Some  assuiiq>tlons  can  be  relaxed  to 

yield  a more  general  problem  formulation,  yielding  restrictions  upon 

the  optimality  of  the  filter  or  modifications  In  the  algorithm  Itself 

(as  In  the  case  of  allowing  w( •,* ) and  v(*,')  to  be  correlated). 
However,  the  stated  assuiiq>tlons  are  the  most  conmon,  and  will  serve 

to  define  the  estimation  problem  for  this  report. 

Before  the  algorithm  Is  delineated.  It  will  be  convenient  to 
Introduce  some  additional  notation.  Define  a conq>oslte  vector 
which  coiq>rlses  the  entire  measurement  history  to  the  current  sample 
time  t^,  and  denote  It  as  ^(t^) : 
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This  Is  a vector  of  growing  dimension:  at  time  t^,  It  Is  a vector 

random  variable  of  dimension  (l*m).  Its  realized  value,  analogous 

to  z(t.  uv ) = 2,  Is  ^ : 

1*  * ~i»  “1 


(7) 


The  state  estimate  provided  by  the  Kalman  filter  at  time  t^, 
before  the  measurement  ^(t^^  Wj^)  = ^ Is  available.  Is  denoted  as 

x(t^“).  In  fact,  this  estimate  Is  the  conditional  mean  of  x(t^) , 
conditioned  on  the  previous  measurement  history  ^ * 


denoted  as  E {x(t^)  | 
* E {x(t^)  I 

'V  'V; 


(8) 


Also  provided  by  the  algorithm  Is  the  conditional  covariance  of  the 
state  conditioned  on  the  previous  measurement  history: 
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P(t^")  - E{[xCtj)  - x(t^~)]  IxCt^)  - X(t^-)]  T |z(t^_p  - _Z^_^} 


(9) 

A 

Since  x(t^“)  is  used  as  the  state  estimate,  this  is  also  the 
covariance  of  the  error  committed  by  the  filter  in  its  attempt  to 
estimate  the  state.  Moreover,  it  can  be  shown  that  P.(t^“)  is  also 
the  unconditional  covariance  of  this  error,  so  that  its  evaluation 
is  completely  Independent  of  the  particular  measurement  history  one 
sees  on  each  individual  use  of  the  filter. 

The  corresponding  state  estimate  at  the  same  time  t but 

i 9 

A 

after  incorporating  the  measurement  ^ is  denoted  as 

x(t^+)  - E{x(t^)  j Z(t^)  - (10) 

f\, 

The  associated  error  covariance  matrix  is  then 


EilxCt^)  [x(t  ) - x(t^+))^|z(t^) 


(11) 


The  algorithm  is  composed  of  two  parts,  one  for  updating  the 
state  estimate  and  error  covariance  when  a measurement  becomes  available, 
and  the  other  for  propagating  those  values  to  the  next  sample  time. 

A 

First,  assuming  x(t^“)  and  P_(t^“)  to  be  already  computed,  the  measurement 

A 

^ will  be  processed  to  yield  x(tj^''’)  and  ^(t^"*") . Then  the  relations 

A 

for  determining  x(tj^^j^”)  and  P.(t^^2~^  will  be  evaluated,  to  complete 
one  cycle  of  the  recursion. 


The  update  equations  are 

K(t^)  - P(tj^")H^(t^)  [H(t^)P(t^-)H^(t^)  + R(t^)]“^ 


(12) 


x(t^''’)  = x(t^")  + K(t^)  - H(t^)x(t^“)l 


(13) 


P(t^+)  = P(t^")  - K(t^)H(t^)_P(t^-) 


(lA) 


The  K(t^)  evaluated  in  equation  (12)  is  the  Kalman  filter  gain. 
"High  gains"  would  be  caused  by  "large"  P.(t^“)  - large  uncertainty 
in  the  output  of  the  filter’s  propagation  model  (to  be  discussed; 
due  to  "large"  Q^) , or  by  "small"  R(t^)  - "small"  corruption  on  the 
measurements.  In  either  case,  the  state  estimate  in  equation 


(13)  is  strongly  influenced  by  the  measurement  data  In 


this  equation,  the  bracketted  term  (the  "innovation"  or  "residual") 

is  the  difference  between  the  data  z.  obtained  from  the  actual 

—1 

measuring  devices,  and  the  best  prediction  of  its  value  before  it 


becomes  available:  [H(t^)3i(t^~)  ] , 


To  propagate  to  the  next  sample  time,  the  following 


differential  equation  can  be  integrated  numerically: 


x(L)  = F(t)x(t)  + B(t)u(t) 


P(t)  = F(t)P(t)  + P(t)F'^(t)  + G(t)^(t)G'^(t) 


(15) 

(16) 


starting  from  the  initial  conditions  x(t^  ) and  ^(t^  ) respectively. 


Tne  result  of  this  integration  would  be  x(tj_|^j^)  and  P(t^~j^)  . A direct 
solution  can  be  written  in  the  form  of 

-^‘^i+1^  " -^‘^i+l’‘^i^-^‘^l  ^ ■'■J  l<t^^^,T)B(T)u( 


(T)dT 


(17) 


(17) 


^1+1 

+ 1 i(ti+l*T)G(T)S.(T)C^(T)|  (tj^l.T)dT 


where  ^ n-by-n  state  transition  matrix  that  satisfies 


_^$(t,  t ) - F(t)  4 (t,  t ) 
dt  " ^ ^ 


for  all  t in  the  interval  from  t^  to  starting  from  the 

initial  condition 


l(t^,t^)  - I 


Note  that,  if  ^(t)  is  held  constant  over  each  sample  period 

(as  would  be  the  case  for  _u  being  generated  by  a digital  controller 

operating  with  the  same  sample  period  as  the  filter),  then  equation 


(17)  simplifies  to 


-^^i+1^  ” — ^^1+1*^1^ 


x(ti+)  + [ I 


i(ti^l.'c)B(T)dT]u(ti)  (20) 


Equations  (18)  and  (20)  are  propagation  difference  equations. 

They  are  the  proper  relations  not  only  for  a state  model  described 
by  the  stochastic  differential  equation  (1),  but  also  a discrete-time 


I 

i 

f 

I 


r 

i 


i 

! 


I 


i 


I 

I 


i 

I 


Equation  (21)  Is  often  termed  an  "equivalent  discrete-time 


model"  to  the  stochastic  differential  equation  (1).  It  Is  j 
"equivalent"  In  the  sense  that  the  statistical  characterization  j 
of  the  process  defined  by  (21)  to  (2A)  and  Initial  condition  (3), 

i 

is  Identical  to  properties  of  the  process  described  by  (1)  to  (3)  ] 
when  viewed  only  at  the  saiqple  times  tj^.t^*.**  etc.  In  the  j 
following  sections,  the  system  model  and  filtering  algorithms  | 
will  be  posed  in  terms  of  this  "equivalent"  discrete-time  system  model.  j 


Although  the  Kalman  filter  is  the  theoretically  optimal 
solution  to  the  filtering  problem  described  in  the  previous  section, 
the  algorithm  itselt  is  prone  to  serious  numerical  difficulties. 

Measurement  updating  of  the  covariance  matrix  by  equation  (14)  can 
involve  the  small  difference  of  large  numbers,  especially  in  the 
case  of  very  precise  measuring  devices.  With  finite  wordlength, 
this  can  cause  severe  truncation  and  precision  problems.  Moreover, 
the  Kalman  filter  algorithm  can  be  shown  to  be  numerically 
unstable,  so  that  once  such  problems  arise,  their  effects  propagate 
rather  than  die  out . 

Thus,  rather  long  wordlength  is  often  required  to  maintain 
acceptable  numerical  accuracy.  For  small  on-line  computers, 
double  precision  computations  are  usually  required.  In 

fact,  without  double  precision  arithmetic,  these  numerical  characteristics 
can  easily  become  the  dominant  error  source  corrupting  estimation 
precision,  and  unfortunately  an  error  source  usually  not  included 
in  designers’  error  budgets. 

The  difficulties  encountered  in  converting  a tuned  Kalman  filter  on 
a long  wordlength,  large  computer  system  used  for  engineering  design  to  an 
effective  algorithm  on  a smaller  wordlength  online  computer  are  well  docu- 
mented.  For  instance,  although  it  is  theoretically  impossible  for  the 
covariance  matrix  to  have  negative  eigenvalues,  such  a condition  can,  and 
often  does,  result  due  to  numerical  computation  using  finite  wordlength, 
especially  when  (1)  the  measurements  are  very  accurate  (eigenvalues  of 

12 


J 


are  small  relative  to  those  of  P.(t“) , this  being  accentuated  by  large 
eigenvalues  in  or  (2)  a linear  combination  of  state  vector  components  Is 
known  with  great  precision  while  other  combinations  are  nearly  unobservable 
(i.e.,  there  is  a large  range  of  magnitudes  of  state  covariance  eigenvalues). 

Such  a condition  can  lead  to  subsequent  divergence  or  total  failure  of  the 
recursion.  On  close  inspection,  even  Kalman  filters  that  maintain  adequate 
estimation  accuracy  exhibit  instances  of  negative  covariance  diagonal 

3i 

terms . 

To  circumvent  these  problems  in  numerics  inherent  to  the  Kalman  filter 
algorithm,  alternate  recursion  relationships  have  been  developed  to  propa- 
gate and  update  a state  estimate  and  error  covariance  square  root  or  inverse 
covariance  square  root  Instead  of  the  covariance  or  its  inverse  themselves. 
Although  equivalent  algebraically  to  the  conventional  Kalman  filter  recursion, 
these  "square  root  filters"  eidiibit  improved  numerical  precision  and 
stability,  particularly  in  ill-conditioned  problems  (i.e.,  the  cases 
described  that  yield  erroneous  results  due  to  finite  wordlength) . The 
square  root  approach  can  yield  twice  the  effective  precision  of  the  conven- 
tional filter  in  ill-conditioned  problems.  In  other  words,  the  same  precision 
can  be  achieved  with  approximately  half  the  wordlength.  Moreover,  this 
method  is  completely  successful  in  maintaining  the  positive  semidefiniteness 
of  the  error  covariance. 

These  excellent  numerical  characteristics,  combined  with  modest  additional 
computation  cycle  time  and  memory  storage  requirements,  make  the  square  root 
filter  approach  a viable  alternative  to  the  conventional  filter  in  many 
applications,  especially  when  computer  wordlength  is  limited  or  the  estima- 
tion problem  is  ill-conditioned.  The  formulation  of  the  square  root  filter 
for  the  case  of  no  dynamic  noise  is  especially  attractive  because  of  its 
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computational  simplicity,  and  its  outstanding  numerical  characteristics  led 
to  its  implementation  in  the  Apollo  spacecraft  navigation  filters. 

A number  of  practitioners  have  argued,  with  considerable  logic,  that 
square  root  filters  should  always  be  adopted  in  preference  to  the  standard 
Kalman  filter  recursion,  rather  than  switching  to  these  more  accurate  and 
stable  algorithms  when  and  if  numerical  problems  occur.  Even  though  Kalman 
algorithms  can  be  made  to  work  in  most  applications,  by  using  double  pre- 
cision mathematics  or  scaling  variables  to  reduce  dynamic  range  or  employing 
ad  hoc  modifications,  numerics  degrade  performance  from  that  achievable  by 
numerically  better  conditioned  recursions.  Recent  investigations  tend  to 
support  an  approach  of  designing  and  tuning  an  optimal  filter  by  the  methods 
of  the  two  previous  chapters,  ignoring  the  errors  caused  by  numerics,  but 
then  implementing  the  square  root  equivalent  for  online  operation.  Neverthe- 
less, one  can  expect  conventional  Kalman  algorithms  to  be  applied  rather 
extensively  as  well. 

Section  3 introduces  the  concept  of  matrix  square  roots,  and  then 
Section  4 develops  the  initially  designed  and  simplest  covariance  square 
root  filter,  applicable  to  the  case  of  no  dynamic  driving  noise  and  scalar 
measurements.  The  succeeding  two  sections  generalize  these  results,  first 
incorporating  vector-valued  measurements  and  then  allowing  dynamic  driving 
noise.  In  Section  7,  the  square  root  counterpart  to  the  Inverse  covariance 
formulation  of  the  optimal  filter  is  considered.  Although  it  is  not  actually 
a square  root  filter,  the  U-D  covariance  factorization  filter  is  very  closely 
related  to  square  root  filtering,  and  it  is  depicted  in  Section  8. 

Finally,  Section  9 presents  the  tradeoff  of  numerical  advantages  and 
increased  computational  burden  of  the  square  root  filters. 


3.  Matrix  Square  Roots 

Let  A be  an  n-by-n,  symmetric,  positive  semide finite  matrix.  Then 
there  exists  at  least  one  n-by-n  "square  root"  matrix,  denoted  as  i^. 


such  that 


- - A 


In  fact,  there  are  many  matrices  which  satisfy  (25)  in  general.  The 

essential  idea  of  square  root  filters  is  to  replace  the  recursion  for  the 

error  covariance  P with  a recursion  for  its  square  root,  and  to  compute 

the  state  estimate  using  an  optimal  gain  calculated  in  terms  of  ^ Instead 

of  P^  itself.  To  motivate  this,  consider  the  scalar  case:  if  dynamic  range 

numerical  precision  problems  are  encountered  in  a filter  that  propagates 
2 

the  variance  P = o , the  problem  can  be  reduced  by  expressing  the  result 
in  terms  of  the  standard  deviation  o since  the  dynamic  range  will  be  effec- 
tively reduced  to  half  the  original  range.  This  basic  idea  can  be  generalized 
to  the  vector  case  by  defining  the  state  error  covariance  square  roots, 
before  and  after  measurement  incorporation  at  time  t^^,  as  S^(t^)  and  S_(t^) 
respectively,  via: 

S(tp  s’’(tp  4P(tp  (26) 

S(t^)  S^(t^)  4 P (t^)  (27) 

Similarly,  define  the  square  root  of  the  covariances  depicting  the  strengths 
of  discrete-time  white  Gaussian  noises  w, (•,•)  and  y(.  •)  as: 

3C(j  ^ 

(28) 

V(t^)v’'(t^)  4R(t^)  4 E{v(t^)v'^(t^)}  (29) 

The  covariance  square  roots  are  not  uniquely  defined  by  (26)  through  (29), 
and  square  root  filters  can  be  formulated  in  terms  of  general  matrix  square 
roots.  One  means  of  exploiting  this  fact  is  to  develop  algorithms  which 
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maintain  a particularly  attractive  square  root  form,  namely  an  upper  or 

lower  triangular  matrix  (with  all  zeroes  below  or  above  the  main  diagonal 

respectively),  thereby  requiring  computation  and  storage  of  only  n(n+l)/2 
2 

instead  of  n scalar  variables. 

This  lack  of  uniqueness  does  not  cause  difficulties  in  converting  from 
a problem  description  in  terms  of  initial  and  time  histories  of  ^nd 

R(t.)  to  corresponding  S , W,(t.),  and  V(t.)  values,  as  might  first  appear 

— 1 — O 1 “1 

to  be  the  case.  The  reason  is  that  any  positive  semideflnlte  matrix  can 
be  factored  into  the  product  of  a lower  triangular  matrix  and  its  transpose 
by  the  Cholesky  decomposition  algorithm.^  Although  (25)  does  not  uniquely 
define  /a,  a unique  Cholesky  lower  triangular  square  root  VR  can  be  defined 
such  that  % = A: 

0 olXj  •••MJ  Ajj  ••• 

0 0 A,„  A^„  A„ 


'12  22 


. In  2n  nn. 


A, _ A_„ 

12  22 


0 0 * • • 


^n  ^2n  ■■■  \ 


The  elements  of  the  Cholesky  square  root  matrix  can  be  generated  sequentially. 


for 

i“l ,2,. . . ,n. 

r 

- 

c 

A.,  - y /A 

L 

ii 

k=l  ^ 

i-1  

li  ■ 

^ ik 

k-1 

j*l 

i > i 


Thus,  A is  scanned  and  is  generated  in  the  order  depicted  in  Fig  1. 


»v 

Later  in  the  Carlson  filter  of  Section  6,  we  will  have  occasion  to 
seek  an  upper  triangular  Cholesky  square  root  such  that  v'k  /k  “ 

Such  a matrix  can  be  found  by  operating  ( 30  } backwards,  or  specifically, 
for  j»n,n-l, . . . ,1,  perform  the  following  computations: 

r 0 i > j 


jj 


n 

- I 

k=j+l 


n 


I 

k=j+l 


i=.i 


1 


( 31  ) 


is  thus  generated  column  by  column,  from  the  n— th  column  to  the 
first  and  from  the  bottom  to  top  within  each  column,  as  in  Fig  2. 
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Ty.wr- 


4. 


4.  Covariance  Square  Root  Filter  for  5^  = 0^ 

In  1964,  Potter  developed  a square  root  filter  implementation  for 
space  navigation  applications  in  which  there  was  no  dynamic  driving  noise 
in  the  system  model,  i.e.  time,  motivated  by  restricted 

wordlength  in  the  Apollo  Guidance  Computer.  For  this  case,  the  time  propa- 
gation in  a conventional  Kalman  filter  would  be  (neglecting  control  inputs) : 


/(ti+i.ti)  (32b) 

By  letting  P(t^)  = S(t^)S^(t^)  and  PCtJ+j^)  = l(^i+l^i^(bi+i) . (32b  ) can  be 
rewritten  as 

[l(t-^^)][s’’(t-^^)]  = [l(ti+l,t^)S(t^)][s’^(t^)/(t^^^.t^)] 

From  this  it  is  evident  that  the  appropriate  time  propagation  relations  for 
the  square  root  filter  would  be 

^^^i+1^  “ ±«^i+l.tj)i(t^)  (33a) 

S(t7+i)  = i(bj^^.i.t^)S(t^)  (33b) 

Because  of  his  particular  application,  Potter  confined  his  attention  to 

scalar  measurements.  The  covariance  measurement  update  for  this  case  is, 
since  H.(tj^)  is  a row  vector, 

p(t;J’)  = P(t-)  - P(t;)H^(t  ) H(t  )P(t7) 

H(t  )P(tT)ir(t  )+R(t  ) 

1 X X X /oiv 


Therefore,  one  can  write  this  as 

S(t+)s'^(t^)  = S(t-)[I-b(t^)a(t^)a’'(t^)]s’'(tp 
where  by  n-by-1  a(t^)  and  scalar  b(t^)  are  defined  by 
a(t^)  = s'’''(tT)H’^(t^) 

J(t^)a(t^)  + R(t^) 


(35)  . 


(36a) 


r 


Potter  showed  that  the  bracketted  term  in  (35)  can  be  factored  into 

I^-baa^]  “ - byaa"^]  [I  - byaa'^]'^  (37) 


where  Y is  a scalar  defined  by 


(38) 


1 + 

Substituting  this  into  (35)  yields  the  covariance  update  as 
S(t^)  =S(tp  [I  - b(tpy(t^)a(t^)J(tp] 

- S(t-)  - b(t^)Y(t^)S(tpa(t^)£’(t^)  (39) 

The  state  estimate  measurement  update  is  of  the  conventional  form,  but  with 
the  Kalman  gain  evaluated  as  [b(tj^)^(t~)a^(tj^)  ] . Thus,  the  measurement  update 


becomes: 


a(t^)  - s’'(tpH’^(t^) 

b(tp  = l/[J(t^)a(t^)+R(t^)] 


Y(t^)  - 1/[1  + ^(t^)R(tj^)] 

K(t^)  « b(t^)  £(t-)  a(t^) 

x(t^)  = x(t2)  + K(t^)  [z^-H(t^)x(tp ] 

S(t^)  - S(tp  - Y(t^)K(t^)  J(t^) 


(40) 


An  equivalent  form  that  is  often  employed  is: 


a(t^)  - /(tpH^^(t^) 


Ktj,)  * 


a(t^) 


a(t^)  + V(t^) 


6(t^)  « l/fa(t^)a(tj^)  ] 


£(t^)  - 6(t^)[S(tpa(t^)] 

x(t^)  = + l(tji){[a(t^)/o(t^)][z^-H(t^)x(t-)]} 

s(t^)  = s(tp  - £(tpJ(t^) 


(41) 
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Note  that,  even  if  S^(t“)  is  lower  triangular,  S^(t^)  will  generally 
not  be  lower  triangular  when  this  update  form  is  used.  A method  that 
preserves  the  lower  triangular  nature  of  the  covariance  square  root  will 
be  discussed  later. 


5. 


Vector-Valued  Measurements 


The  preceding  section  considered  scalar  measurement  updates. 

Bellantonl  and  Dodge^ extended  these  results  to  the  vector  measurement  case 
by  using  eigenvalue  decompositions,  but  their  algorithm  Is  Inefficient  for 
the  typical  case  In  which  the  measurement  vector  dlmenstlon  m Is  signifi- 
cantly less  than  the  state  dimension  n.  Andrews^ also  developed  an  update 
that  processed  an  m-dlmenslonal  measurement  vector  In  a single  update, 
without  requiring  dlagonallzatlon : 

A(t^)  » s’^(t-)H'^(t^) 

£(t^)  « ^'^(t^)A(t^)  + R(t^)  (42) 

x(t^)  = x(tp  + S(t2)A(tj^)[r^(t^)]V^(t^)[z^-H(tj^)i(tp] 

S(t^)  - sup  - S(tpA(tp[r^(tp]'^[Z(t^)+V(tp]“V(tp 

This  can  be  seen  to  be  a direct  extension  of  (41),  and  It  Is  more  effi- 
cient computationally  than  the  Bellantonl  and  Dodge  algorithm.  Processing 
a measurement  entails  a Cholesky  decomposition  of  an  m-by-m  matrix  to 
generate  ^(t^),  (the  extension  of  o(tj^))  and  Inversion  of  two  triangular 
m-by-m  matrices,  ^(tp  and  [^(t^)  + V(t^)]. 

For  thdi  covariance  square  root  filter,  the  most  efficient  means  of 
performing  a vector  measurement  update  Is  to  employ  the  Potter  scalar 
update,  (40)  or  (41),  ^ repeatedly  m times.  An  m-dlmenslonal  measure- 
ment vector  can  always  be  processed  equivalently  as  m scalar  measurements. 

If  R(tj^)  Is  diagonal,  the  ® components  can  be  treated  as  Independent  measure- 
ments and  processed  sequentially.  If  R(t^)  Is  not  diagonal,  the  procedure 
Is  somewhat  more  complicated.  First  the  Cholesky  decomposition  of  R(t^)  Is 
computed,  yielding  ^R( t^)  as  a lower  triangular  matrix.  Then  a transforma- 
tion of  variables  Is  used  to  convert 

?3 


J 


z(t  ) •=  H(t  )x(t  ) + v(t  ) 

•v  1 1 1 


I (t^)  = H*(t^)x  (t^)  + v*(t^) 


where 


^'^R(t  ) z*(t  ) » z(t  ) (45a) 

1 ^ X X 

H*(t^)  = H(t^)  (A5b) 

^(t^)  v*(t^)  - v(t^)  (45c) 

Note  that  (45c)  implies  that  v*(*,-)is  a unit  power  v^iite  Gaussian  noise, 
i.e.,  E{v*(t  )v*^(t  )}  = ^,  since 

E{v(t.)v’^(t^)}  = R(t^)  - E{5fcEp  v*(tpv*'^(t^) 

- 5i^)  E{v*(t^)v*’^(t^)} 

Thus,  the  components  of  z'(t.  ,a).)  = can  be  processed  one  at  a time 

X»  1 J X 

sequentially.  Moreover,  (45a)  and  (45b)  can  be  solved  to  yield  z^  and 
* 

H (t^)  by  simple  back  substitution  rather  than  matrix  inversion,  because 
'^vR(tj^)  is  lower  triangular  and  thus  the  j-th  component  of  is  a linear 
combination  of  the  first  j components  of  z^^^. 

As  noted  in  the  previous  section,  the  ^(t^)  matrix  generated  by  these 
update  forms  is  generally  not  lower  triangular,  even  if  £(tp  is. 


6.  Covariance  Square  Root  Filter  for  0 


If  dynamic  driving  noise  enters  the  system  model,  the  conventional 
Kalman  filter  propagates  the  covariance  matrix  from  one  measurement  time 
to  the  next  by  means  of: 

f<tt)  t,)  + 

where  this  might  be  an  equivalent  discrete-time  representation  of  a continu- 
ous-time system  with  sampled  output  (in  which  case  = X).  Now  we 

wish  to  develop  an  analogous  recursion  to  yield  in  terms  of  £(t^). 

It  would  be  desirable  to  generate  a lower  triangular  since  then 

2 

only  1/2  n(n+l)  elements  would  require  computation  rather  than  n . 

One  means  of  achieving  the  desired  result  is  called  the  Matrix  RSS 
(Root-Suin-Square)'°method; 


This  method  actually  computes  P(t“^j^)  and  then  generates  as  its  lower 

triangular  Cholesky  square  root.  Although  this  is  a rapid  method,  it  does 
suffer  in  having  only  the  same  numerical  precision  as  the  conventional  filter 
time  propagation.  Nevertheless,  since  it  is  the  measurement  update  and 
not  the  time  propagation  that  causes  the  critical  numerical  problems  in  the 
filter,  (A7)  may  well  be  acceptable  for  many  applications. 

The  other  means  of  establishing  the  time  propagation  relations  is  called 
the  Trlangularization  Method T In  Section  4,  the  desired  result  (33b) 
was  established  by  writing  both  sides  of  the  covariance  propagation  (32b) 
in  terms  of  a factor  times  its  own  transpose,  and  then  equating  the  indl- 


vidual  factors.  Let  us  attempt  to  apply  the  same  logic  to  (^6)-  Assume 
that  the  square  roots  of  available:  for  ^ and 

for  all  tj^,  a Cholesky  decomposition  could  be  used,  and  for  P(t^)  in  general, 
assume  the  square  root  has  been  propagated  and  updated  by  the  filter  algo- 


rithm. Thus,  we  have: 


P(t^)  - S(t^)  S^(t^) 


Note  that  ^(t^)  need  not  be  lower  triangular  (important  in  view  of  the 
preceding  section).  Equation  (46)  can,  therefore  be  written  as 


(48a) 

(48b) 


+x..T,.+.  T 


(49) 


Now  it  is  desired  to  find  the  propagation  equation  for  the  square  root  of 

,T 


P(t~_j^l):  to  find  the  relation  to  yield  £(t"^j^)  such  that  £(t-_|^j^)^  ^*^i+l^ 
is  equal  to  the  right  hand  side  of  (49) , 

One  such  matrix  would  be  defined  by 

I(t-^l)  “ [±(tj^.j^.t^)S(t^)  I G^(t^)W^(t^)]  (50) 

However,  if  £(t^)  is  n-by-n,  then  [|.(t^^j^,t^)£(t^)  ] is  n-by-n  and 
[^(t^)^(t^)]  is  n-by-s,  so  would  be  an  n-by-(n+s)  square  root 

of  P(t^_j_j^).  Since  this  type  of  square  root  increases  the  dimension  of  the 
covariance  square  root  matrix  for  each  propagation  interval,  it  must  be 
rejected  as  computationally  impractical. 


However,  this  does  in  fact  provide  a fruitful  insight.  If  £(t“^j^) 


is  a square  root  of  P(t"^j^),  then  so  is  [S(t~^j^)T]  if  T is  an  orthogonal 


(ni-s)-by-(n+s)  matrix,  i.e.  T = _I , since 
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Therefore,  if  an  orthogonal  matrix  T can  be  found  such  that 

I 0 ] }n  rows  (52) 

n colusns  s colunns 

then  in  fact  an  n-by— n square  root  matrix  will  have  been  found  which 

satisfies  the  desired  relationship.  If,  in  addition,  this  ^(t“^j^)  were  lower 
triangular,  the  result  would  be  especially  advantageous.  Two  methods’^of 
generating  such  a » known  as  trlangularizatlon  algorithms,  are  the 

Gram-Schmldt  orthogonalization' procedure  and  the  Househol<fer  transformation^ 

technique.  Note  that  the  same  procedure  could  also  be  applied  to 
• [I-K(t^)H(t2)]P(tp[I-K(t^)H(t^)]’‘^  + K(t^)R(t2)K''’(t^) 
or 


p"^(t^)  - pr^(t-)  + H^(t^)R”^(t^)H(t^) 


for  vector  measurement  updates;  the  latter  of  these  will  be  discussed 
subsequently. 

First  let  us  demonstrate  that  the  Gram-Schmidt  procedure  yields  the 
desired  result.  Let  e denote  the  n-dlmensional  vector  composed  of  all 
zeroes  except  for  a one  as  the  k-th  component,  so  that  . . . , e” 

form  the  standard  basis  for  n-dimenslonal  space.  Then  [£ 
just  the  k-th  colunn  of  £ dimension  (n+s): 


) 


n columns 


Mn+s)  rows 


(53) 


where 


f = f (ti+i)  £ 


-n  . n 

s ' S e 


Construct  the  orthonormal  basis  vectors  b^,  . . . , b"  (each  of  dimension 
(n+s))  by  the  Gram-Schmidt  procedure  as; 

=•  unit  I 

u2  /~2  r-Zri,,!,  I 

^ = unit  (s^  ~ ^ ]b  ) » (55 

/-S  r-3T,2,,2v 

^ • unit  ~ b Jb  - [§,  ^ 1^  ) 


If  P(t^)  is  positive  definite,  then  ? rank  n,  so  n such  orthog- 

onal unit  basis  vectors  can  be  obtained.  Now  the  desired  orthogonal  trans- 
formation matrix  ^ can  be  defined  as  the  (n+s)-by-(n-fs)  matrix 
.r.  rv-l  1.2  .n  ,n+l  v^+s. 

T * [b  ,^,...,b,^  j (56) 

where  b^,  b^,  . . . , b*^  have  been  computed  as  in  (55)  and  the  remaining  s 
columns,  b^^^,  ...,  b°^,  are  additional  orthogonal  unit  basis  vectors  of 
(n+s) -dimensional  space.  However,  it  will  be  shown  that  they  do  not  have 
to  be  computed  to  obtain  £(t~^j^),  so  their  generation  will  not  be  specified 
explicitly. 

T 

At  this  point,  T ^ written  as 
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1 

1 

— b^T  — 

. 

• 

...1  U .vn  1 

8 S • • • S 1 

1 

1 

L ' 

, ITU 
b £ 

. 2T«1 
b 8 


b"V 

b2Tg2 


b"V 

b^V 


(57) 


• nT^l  , nT^2 

b 8 b s 


l,(«+8)Tgl 


, (n+«)T-2 
D s 


. nT-« 
b s 


. (n+8)T-n 
D s 


*^T  X 2 n 

However,  since  the  rank  of  £ » £ » •••.  £ ^ span  Its 

range  space  while  are  orthogonal  to  this  spanning  set,  it 

follows  that  the  last  s rows  in  (57)  are  all  zeroes.  By  the  manner  in 
which  the  basis  vectors  were  chosen,  it  is  also  true  that 

1^3  - 0 k > j 

by  the  same  reasoning.  Thus,  (57)  becomes 


"lT.,1  ,lT-2  ulT^nl 

t £ £ £ £ £ I 


. 2T.,2 

b s 


T-vT  ' 

IS  (tj^l)=; 


b'Vi 


•n  rows 


f 


} 


s rows 


n columns 


(58) 


T 

The  upper  n-by-n  partition  of  this  matrix  is  just  the  £ we  have  been 

seeking,  so  that  £(^^^2^)  i®  in  fact  an  n-by-n  lower  triangular  matrix: 
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J 


I 

r: 

I 

'_i 

I 

I 


Ktl+i) 


.1T«1  ^ 

b £ 

.1T~2  .2T^''v 

b s b s 


,lT./n  ,2T-^n 
lb  s b s 


(59) 


An  efficient  computational  form  of  the  Gram-Schmldt  orthogonallzatlon 

^3  -i'* 

called  the  Modified  Gram-Schmldt  (MGS)  algorithm  has  been  shown  to  be 
numerically  superior  to  the  straightforward  classical  procedure,  i.e. 
less  susceptible  to  roundoff  errors.'  Moreover,  it  requires  no  more  arith- 
metic operations  than  the  conventional  Gram-Schmidt  procedure,  uses  less 
storage,  and  has  been  shown  to  have  numerical  accuracy  comparable  to 
Householder  and  Givens  transformations.  Generation  of  ^(t“^j^)  through  this 
recursion  proceeds  as  follows.  Define  the  initial  condition  on  A , an 
(n+s)-by-n  matrix,  as 


(60) 


k k 

Notationally  let  A^  denote  the  j-th  column  of  A . Perform  the  n-step 
recursion,  for  k=l,2, . . . ,n: 


C61) 


Note  that  on  successive  Iterations,  the  new  A^  vectors  can  be"wTitten  over 
, .k 

the  ^ vectors  to  conserve  memory.  At  the  end  of  this  recursion. 
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Notice  that  the  computational  algorithm  never  calculates  or  stores  T 
explicitly  in  generating  • 

i«i 

A Householder  transformation  can  also  be  used  to  solve  (52)  for 
the  square  root  matrix  Conceptually,  It  generates  T as 

X ■ ^ ^ ^ ^1 

It 

where  Is  generated  recursively  as 

_k  _ jk  k kX 
T = _!  - d u u 

k k 

with  the  scalar  d and  the  (n+s)-vector  u defined  below.  However,  the 

k 

computational  algorithm  never  calculates  these  T 's  or  ^ explicitly.  The 

If 

initial  condition  on  the  (n+s)-by-n  A is 


Again,  letting  ^ represent  the  j-th  column  of  A , perform  the  n-step 


recursion,  for  k*l,2,..,,n; 


k,  k . .k  . 


' 0 , j < k 

k k . ,k  . , 

"j  * *\k  ’ J 

*5k  . j ■ O' 


j « (k+l),...,  (n+s) 


0 j < k 

1 j - k 

Ir  VT  If 

“ Aj  , j - (k+l) n 

.k  k kl 

“A  “ u 2. 


At  stage  k,  the  first  (k-1)  columns  of  A are  zero  below  the  diagonal  of  the 

upper  square  partition,  and  u has  been  chosen  so  that  the  subdiagonal  elements 
k+l 

of  ^ will  be  zero.  After  the  n iterations  of  (64). 

(65) 

and  then  is  generated  as 

(66) 

3 2 

The  Householder  triangularization  requires  1/6 [4n  + 6n  (s+1)  + 2n] 

3 2 

multiplies,  l/6[4n  + 6sn  + 8n]  adds,  n divides,  and  n square  roots. 

This  is  1/6 t2n^  - 8n]  fewer  multiplies  and  1/6 [2n^  + 3n^  - lln]  fewer 
adds  than  required  by  the  Modified  Gram-Schmidt  algorithm.  However,  the 
MGS  algorithm  becomes  slightly  more  precise  numerically  as  the  residual 
size  Increases,  and  thus  is  a viable  alternative. 

A Householder  transformation  method  has  also  been  proposed  for 
performing  measurement  updates.  However,  this  has  been  shown  to  be  equiva- 
lent to  the  Potter  method  described  previously,  but  not  as  efficient 
computationally. 

Thus,  the  covariance  square  root  filter  (Potter  filter)  algorithm  can 
be  specified  as  follows.  The  propagation  of  the  state  estimate  from  one 
sample  time  to  the  next  is  given  by  (33a).  Covariance  square  root  time 
propagations  are  calculated  by  means  of  the  matrix  RSS  method  (47),  the 
MGS  algorithm  given  by  (60)  to  (62) , or  the  Householder  transformation 

as  in  (63)  to  (66).  Of  these,  the  latter  two  are  preferable  since  they 

are  more  accurate  numerically  than  the  computationally  efficient  first 


is  generated  as 
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nethod,  and  numerics  are  Che  basic  motivation  for  square  root  forms.  Measure- 
ment updates  would  be  processed  through  m iterations  of  the  Potter  algorithm 


(40)  or  (41).  If  the  R(t^)  matrix  is  not  diagonal,  the  transformation  of 
variables  given  by  (43)  to  (45)  must  first  be  performed. 

One  significant  drawback  of  the  covariance  square  root  filter  just 

described  is  that  the  triangularity  of  the  square  root  matrix  is  generally 

2 

destroyed  during  the  measurement  updating.  Consequently,  all  n elements 
must  be  computed  and  stored.  A more  recent  method  developed  by  Carlson' 
provides  substantial  improvement  in  both  computational  speed  and  required 
storage  by  maintaining  the  covariance  square  root  matrix  in  triangular 
form.  By  doing  so,  only  n(n+l)/2  memory  locations  need  be  allocated  for 
^(t^),  and  the  product  I^(t^^j^,tj^)^(t^)  ] for  the  subsequent  time  propaga- 
tion requires  only  half  the  usual  number  of  computations. 

Like  the  Potter  measurement  update,  the  Carlson  algorithm  processes 
vector  measurements  iteratively  as  scalars.  Therefore,  consider  the  general 
square  root  solution  to  (35); 

S(t^)  - S(tp[I-b(t^)a(t^)J(t^)]^^^ 

- S(t-)[I-a(t^)a'^(t^)/d(t^)]^/2  (67) 

where,  for  convenience,  d(t^)  has  been  defined  as  l/b(t^).  Assuming  ^(tp 

Y 2/2 

to  be  upper  triangular,  we  seek  a matrix  [I.-£(tj^)a^  (t^)/d(t^)]  such 
that  the  ^(t^)  computed  in  (67)  is  also  upper  triangular.  ITie  choice 
between  upper  and  lower  triangular  form  is  arbitrary,  governed  by  selecting 
either  forward  or  backward  recursion  algorithms  for  the  Cholesky,  Householder, 
and  Gram-Schmidt  procedures.  Upper  triangular  forms  are  motivated  to  some 
extent  by  state  vector  partitioning,  as  discussed  by  Carlson.' 
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The  desired  square  root  matrix  Is  in  fact  derived  by  means  of  an 


analytic  Cholesky  decomposition,  and  can  be  expressed  as 
] fO  •••  aj  ■)  Tci 

b.  - 0 a2  •••  a2  c 


° j 


where  a^  Is  the  k-th  component  of  a(t^) , and  bj^  and  Cj^  for  k*l,2,...,n 

will  be  described  presently.  However,  the  computational  algorithm  neither 

computes  this  square  root  explicitly  nor  requires  a matrix  multiplication 

as  in  (67)  to  generate  ^(t^) . The  algorithm  is  Initialized  by  setting 

the  scalar  d and  n'^vectors  e and  a as: 

o — o — 

do  - R(t^) 

So  = 0 (68) 

a - s’’(tpH^(t^) 

and  iterating  for  k-=l,2,...,n  on: 


- S-l  ^ “K 
’■k  - <\-l«k>' 


'k  - •k«\-l'>k>  i 

^ • ^-1  I 


In  the  recursion,  ^ denotes  the  k-th  column  of  £(tp,  and  both  it  and  ^ 
consist  of  zeroes  below  the  k-th  element.  After  the  n iterations,  ^(t^) 


Is  produced  as 
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v^lch  is  an  upper  triangular  matrix.  The  state  vector  update  is  then  given 


For  time  propagations,  Carlson  suggested  the  matrix  RSS  method,  (47), 
but  with  an  upper  triangular  Cholesky  square  root  as  generated  in  (31) 
replacing  the  lower  trlangxilar  form  in  (47).  However,  the  triangulariza- 
tion  methods  could  also  be  employed,  thereby  sacrificing  some  computational 
speed  for  increased  numerical  precision.  A Modified  Gram-Schmidt  algorithm 
can  be  written  by  initializing  A according  to  (60)  and  then  iterating 
for  k»n,n-l, . . . ,1  on: 


^ * -^^^kk^^i+l^ 

j-l,2,...,(k-l) 

^ ■ ^jk<"I+l^^k  J 

where  denotes  replacement  by  means  of  "writing  over  old  variables 


j 

7.  Inverse  Covariance  Square  Root  Filter 

The  Inverse  covariance  formulation  of  the  optimal  filter  Is  an 
algorithm  which  Is  algebraically  equivalent  to  the  Kalman  filter  but 
has  substantially  different  characteristics,  such  as  being  able  to 
Incorporate  unknown  Initial  conditions  and  being  more  efficient  If  the 
measurement  vector  Is  very  large  In  dimension  (m>n) . Now  we  consider  the 


square  root  filter  analog  of  such  a formulation. 


If  we  define  the  covariance  square  root  matrix  ) through 


(73) 


then  It  Is  consistent  that  an  Inverse  covariance  square  root  ^ 


defined  through 


A S"’'(t^'*‘)  S"^(t^'^) 


(74a) 


where  S ^ denotes  Similarly,  ^ ^ would  be  defined 


through 


P“^(t.“)  A S ’’(t^")  S'^(t^  ) 


(74b) 


To  develop  the  Inverse  covariance  square  root  filter,  first  consider 
the  measurement  update  equation  in  the  inverse  covariance  filter: 


P ^(t^"^)  - l“^(t^‘)  + H’^(t^)R”^(tj^)H(t^)  (75) 


Using  (29)  and  (74) , this  can  be  written  as 


^"^(t^"^)  = i"’’(t^")s"^(t^‘)  + H^(tj^)v"’^(t^)v“^(tj^)H(t^) 


(76) 


—1  + —1  + T —1  + 

We  now  seek  an  update  relation  for  £ (t^,  ) such  that  {[£  (t^  )]  (t^  )]} 

is  equivalent  to  the  right  hand  side  of  (76) . One  such  matrix  would  be 
the  (n+m)-by-n  matrix 


■“-1  + 
S ^(tj  ) 


v"^(t.)  H(t^) 


(77) 
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i > 

J fi 


' I 


i I 


1 


I 

i 


As  in  the  previous  section,  such  an  ) would  be  unacceptable  due  to 

the  Increasing  matrix  dimensions  it  would  cause.  However,  if  an 
orthogonal  matrix  T can  be  constructed  such  that 
q.  , , S ^(t.^)  } n rows 

T^r^(t/)  - (7fc 


i”’‘^(t^'^)T  - 0]  (78’) 

then  the  resulting  n-by-n  ^ desired  square  root  matrix. 

In  analogy  to  the  previous  development,  it  could  be  especially  beneficial 
if  the  S BO  generated  were  upper  triangular.  Either  the  Modified 

Gram-Schmidt  orthogonalization  procedure  or  the  Householder  transformation 
algorithm  can  be  employed  to  solve  for  the  desired  ^ and  this  will 

be  developed  in  detail  after  the  state  estimate  is  discussed. 

In  its  operation  the  Inverse  covariance  filter  does  not 
compute  a state  estimate  directly,  but  rather 

£(t^")  A P‘^(t^")  S(t^")  (79a) 

±Ct*)  A (79b) 

which  are  related  by 

(80) 

Analogously,  the  inverse  covariance  square  root  filter  does  not  generate 
an  estimate  of  the  state  explicitly,  but  instead  calculates 


a(t^~)  A x(tj^") 


a(t^^)  A S“^(t x(t/) 
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The  update  relationship  between  these  estimates  can  be  shown  to  be 

} n rows 
} m rows 


n rows  { 

T 

a(t^‘) 

m rows  { 

B(t  3 

• l" 

v‘^(t  32, 

— i 

- 1 -i 

(82) 


where  T is  the  same  orthogonal  matrix  as  in  (78),  and  ^(t^)  is  an 

m-dlmenslonal  vector  (the  residual  after  processing  the  measurement, 

[£^-H(t^)x(t^^) ])  that  need  not  be  calculated.  Since  the  first  n rows 
T 

of  ^ are  the  result  of  an  n-step  Gram-Schmidt  or  Householder  process, 
a(tj^  ) can  be  computed  without  knowledge  of  any  additional  portion  of  T 
than  that  generated  by  either  of  the  triangularlzatlon  algorithms  discussed 
previously. 

The  Modified  Gram-Schmidt  (MGS)  measurement  update  initializes  an 
k k 

(n4in)-by-n  matrix  A and  an  n-vector  ^ as 

(83a) 

(83b) 

Then  an  n-step  recursion  is  performed  that  is  identical  to  (61)  except 

for  two  additional  equations  for  eventual  generation  of  fi(tj|^);  for  k«l,2,...,n 


- r^(ti'^) 


s‘^(ti") 

v”^(t^)H(ti) 
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(87) 


r 


.n+l 


, b 


n+l 


— 1 


6(t.) 
— 1 


For  time  propagations  in  which  the  dynamic  driving  noise  is 
s-dimensional,  s scalar  recursions  analogous  to  the  Potter  measurement 
update  in  the  covariance  square  root  filter  are  performed.  Thus  ^(t^) 
is  assumed  diagonal,  perhaps  after  a change  of  variables,  and  the  effects 
of  are  incorporated  component  by  component.  Letting  be  the 

k-th  column  of  Gj(t^)  and  be  the  k-th  diagonal  element  of  the 

algorithm  becomes,  for  k *=  1,2,. ..,s: 


(88) 


Note  the  order  of  the  time  indices  on  the  state  transition  matrix  and 

K-1 


that  ^ ^^i+l’^i^*  After  the  first  of  the  s iterations  of 

(88),  — ^^i»*'i+l^  replaced  by  the  identity  matrix.  In  analogy  to  the 

covariance  square  root  filter,  a Householder  transformation  has  also  been 
proposed  for  performing  the  time  propagation,  but  it  has  been  shown  to  be 
equivalent  to,  but  less  efficient  than,  the  Potter  type  algorithm  given 
in  (88) 
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Thus,  in  the  Inverse  covariance  square  root  filter,  measurement 
updates  are  conducted  in  vector  form  through  a triangularization  procedure, 
and  time  propagations  Involve  iterative  applications  of  a Potter-type 
scalar  incorporation  algorithm.  This  is  in  direct  opposition  to  the 
covariance  square  root  filter.  As  a result,  its  time  propagations  are 
more  efficient  than  those  of  the  covariance  square  root  filter  for  the 
typical  case  in  which  the  state  dimension  is  much  greater  than  the 
dynamic  noise  dimension  s.  On  the  other  hand,  its  measurement  update 
is  more  efficient  only  when  the  measurement  dimension  m is  considerably 
greater  than  n.  Although  most  applications  have  shown  the  covariance 
square  root  filter  to  be  more  efficient  computationally,  there  are 
circumstances  (m»n»s)  under  which  the  inverse  covariance  square  root 
formulation  is  preferable.  Section  9 will  compare  the  various  forms 
explicitly. 


8. 


U-D  Covariance  Factorization  Filter 


I.  0,7,  13 


Another  approach  to  enhancing  the  numerical  characteristics  of  the 
optimal  filter  algorithm  is  knovm  as  "U-D  covariance  factorization." 

Rather  than  decomposing  the  covariance  into  its  square  root  factors 
as  in  (26)  and  (27)  this  method  expresses  the  covariances  before  and 
after  measurement  incorporation  as 

P(t  r)  - U(t  r)D(ti“)u’^(t^")  (89) 

nti"*")  - U(ti'^)D(ti'^)u’^(ti'^)  (90) 

where  the  £ matrices  are  upper  triangular  and  unitary  (i.e.,  ones  along 

the  diagonal)  and  the  D matrices  are  diagonal.  Although  covariance  square 

roots  are  never  explicitly  evaluated  in  this  method,  this  filter  algorithm 

1/2 

is  included  in  this  chapter  because: (1)  UD  corresponds  directly  to  the 
covariance  square  root  of  the  Carlson  filter  in  Section  6^  and  the 
Carlson  filter  in  fact  partially  motivated  this  filter  development,  and 
(2)  the  U-D  covariance  factorization  filter  shares  the  advantages  of  the 
square  root  filters  discussed  previously:  guaranteeing  nonnegativity  of 
the  computed  covariance  and  being  numerically  accurate  and  stable.  (Merely 
being  a square  root  filter  is  not  a sufficient  condition  for  ninnerical 
accuracy  and  stability,  but  the  algorithms  discussed  previously  do  have  these 
attributes.)  Like  the  Carlson  filter,  triangular  forms  are  maintained  so 
that  this  algorithm  is  considerably  more  efficient  in  terms  of  computations 
and  storage  than  the  Potter  filter.  Though  similar  in  concept  and  computa- 
tion to  the  Carlson  filter,  this  algorithm  does  not  require  any  of  the 
(nnri-s)  computationally  expensive  scalar  square  roots  as  processed  in  the 
former. 
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Before  considering  the  filter  algorithm  Itself,  let  us  demonstrate 
that,  given  some  F as  an  n-by-n  symmetric,  positive  semldeflnlte  matrix 


a unit  upper  triangular  factor  U and  diagonal  factor  D such  that  P •=  U D ^ 
can  always  be  generated.  Although  such  U and  D matrices  are  not  unique, 
a uniquely  defined  pair  can  in  fact  be  generated  through  an  algorithm 
closely  related  to  the  backward  running  Cholesky  decomposition  algorithm. 


(31).  This  will  be  shown  by  explicitly  displaying  the  result.  First,  for 
the  n-th  column 


D - P 
nn  nn 


U ./I  i - n 

i - n-l,n-2,...l 


in  nn 

Then  for  the  remaining  columns,  for  J « n-l,n-2, . . . ,1,  compute: 


) 


(91a) 


'±i 


i > J 
i “ j 


''ik  i-  j-l.j-2,...,l 


V (91b) 


This  is  useful  for  defining  the  required  U-D  factors  of  ^ and  the  ^ time 
history  for  a given  application. 


To  develop  the  filter  algorithm  itself,  first  consider  a scalar 


measurement  update,  for  which  H(t£)  is  1-by-n.  For  convenience,  we  drop 


the  time  index  and  let  P(t^”)  * P , P,(tj^)  ■ and  so  forth.  The 


Kalman  update 


P*”  - P"  -(p"h^)  •“  (HP”) 


(92) 


a-HP'n’^+R 


can  be  factored  as 


+ + +T  - - -T  1 - - _T  T _ _ _T 

UDU  “UDU  - J (UDU  H ) H £ D U 


- U"  [D*  (D"u“V)(D"irV)^]U”^ 


(93) 
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Thus,  ~ i.  V ] is  scanned  and  IJ  is  generated  column  by  column,  as 

depicted  in  Figure  3.  The  validity  of  the  terms  generated  in  (99)  can 

be  demonstrated  by  substituting  them  into  (91)'  and  showing  that  the  resulting 

[P.,]  matrix  is  in  fact  [D  - — v v ^]. 
ij  — a 

The  scalar  measurement  update  for  the  U-D  covariance  factorization 
filter  can  now  be  specified.  At  time  t^,  U^(tj~)  and  D(t^~)  are  available 
from  a previous  time  propagation  (to  be  discussed).  Using  the  measurement 
value  and  the  known  1-by-n  H(tj^)  and  scalar  R(t^),  one  computes 


f - u’^(tj")H’’(t^) 

j • 

a - R 

o 

Then,  for  k - l,2,...,n,  calculate  the  results  of  (99),  but  in  a more 
efficient  manner  as 


In  (101),  denotes  replacement,  exploiting  the  technique  of  "writing 

over"  old  variables  for  efficiency.  For  k*l,  only  the  first  three  equations 

need  be  processed.  After  the  n iterations  of  (101),  U(tj^)  and  D.(tj^)  have 

been  computed,  and  the  filter  gain  K(tj^)  can  be  calculated  in  terms  of  the 

n-vector  b made  up  of  components  b, , b_,  ...,  b computed  in  the  last 
“ i / n 


iteration  of  (101),  and  the  state  updated  as 


'kU  1 -'b/a 
1 — n 

xCt^'^)  -x(t^“)  +K(t^)  - H(t^)x(t  .')] 


(102) 


Vector  measurement  updates  would  be  performed  component  by  component, 
requiring  a transformation  of  variables  as  in  Sec  tion  5 if  R(t  ^)  is  not 
originally  diagonal. 


The  time  propagation  of  the  U-D  factors  employs  a generalized  Gram- 
Schmidt  orthogonallzatlon  to  preserve  numerical  accuracy  while  attaining 

Si 

computational  efficiency.  Given  the  covariance  time  propagation  relation 
^^’^i+l'^  “ + G^(t^)5^(t^)^^(t^)  (103) 


and  the  U-D  factors  of  JP(t^^),  we  desire  the  factors  ) and  D(t^_j_j^  ) 

— — T — 

such  that  IU(tj^^j^  ^—^^1+1  ^^i+1  equals  the  right  hand  side  of  (103)" 

Without  loss  of  generality,  assumed  diagonal,  since,  given  the 

T 

n-by-n  matrix  [Gj(t j^)gj(t^)Gj  (tj^)],  (91)  can  be  used  to  generate 

as  its  U-factor  and  Cjj(t^)  as  its  D-factor. 

If  an  n-by-(n+s)  matrix  Y(t^^j^  ) and  an  (n+s)-by-(n+s)  diagonal  matrix 
D(t^^j^  ) are  defined  as 

(104a) 
(104b) 

then  it  can  be  seen  that  [T(t^_l_j^  )]  satisfies  (103). 


D(t  +)  I 

i I 


: Sd( 


g ■ 
("il 
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Similar  to  the  development  of  (.^3)  to  (79)  of  Section  the 

desired  result  can  be  generated  through  a Gram-Schmidt  procedure  applied 
to  K^i+l  >•  The  only  significant  modification  is  that  the  inner  pro- 
ducts used  in  the  procedure  are  weighted  inner  products:  whereas  in  (55) 

the  inner  product  of  (a  colximn  of  ))  and  a basis  vector  ^ was 

1 T k i T - 

written  as  b ],  here  the  inner  product  of  ^ (a  column  of  Y )) 

and  a basis  vector  b^  would  be  written  as[^^ 

analogous  development  is  made,  D(t^^)  and  ^(t^^)  can  be  identified  as,  for 


j “ l,2,...n  and  k = j,j+l,...,n: 


- Ibh^  >i- 


“jj^'i+l  " ^ -'"i+l 

^k^’'i+l  ^ -^’'i+1  ^ 


(105a) 


(105b) 


As  in  Section  6,  the  actual  computational  algorithm  is  the  efficient, 
numerically  superior  Modified  Weighted  Gram-Schmidt  (MWGS)  method.  Thus, 
the  time  propagation  relations  are  to  compute  Y(t^^^  ) and  D(t^^j^  ) as  in 
(104),  and  initialize  n vectors,  each  of  dimension  (n+s),  through 


:•  £2  : — : = Y (t  ) 


and  then  to  iterate  on  the  following  relations  for  k*n,  n-1,  . . . , 1 ; 


= D(t  )a^^ 

■4,  “ ■^'^\k^‘^l+l  ^ 

“jk's+r’  ■ ^ 4 

-j  " -j  " '^jk^'i+i’^^ 


(Ckj  “ )aj^j;  j-1,2,., 


j - l,2,...,k-l 


As  before,  denotes  replacement,  or  "writing  over"  old  variables  to 
reduce  storage  requirements.  On  the  last  iteration,  for  k«l,  only  the 
first  two  relations  need  be  computed.  The  state  estimate  is  given  by 


^^^i+l'>  * i(t.^^,t^)£(t.'^)  ! (108) 


^ 1 


9 . Filter  Performance  and  Requirements 

The  algorithms  of  this  chapter  have  been  investigated  in  order  to 
implement  the  optimal  filtering  solution  given  by  the  Kalman  filter,  but 
without  the  numerical  instability  and  inaccuracy  of  that  algorithm  when 
processed  with  finite  wordlength.  In  this  section,  both  the  numerical 
advantages  and  the  increased  computational  burden  of  these  filters  will 
be  delineated. 

An  algorithm  can  be  said  to  be  numerically  stable  if  the  computed 
result  of  the  algorithm  corresponds  to  an  exactly  computed  solution  to 

■iC 

a problem  that  is  only  slightly  perturbed  from  the  original  one.  By 
this  criterion,  the  Kalman  filter  is  numerically  unstablet  in  both  the 
conventional  and  Joseph  formulations.  In  contrast,  all  of  the  filters 
described  in  this  chapter  can  be  shown  to  be  numerically  stable. 

The  numerical  conditioning  of  a set  of  computations  can  be  de- 
scribed in  part  by  what  is  called  a "condition  number",  a concept  which 
is  often  used  to  analyze  the  effects  of  perturbations  in  linear  equa- 
tions. If  A is  a matrix,  not  necessarily  square,  then  the  condition 
number  k(A)  associated  with  A is  defined  by  * 


f 


I 


I 

I 


k(A) 


a /a  . 
max  min 


find) 


0 0 T 

where  a ^ and  a . ^ are  the  maximum  and  minimum  eigenvalues  of  A A 
max  min  

respectively.  When  computing  in  base  10  (or  base  2)  arithmetic  with  N 


significant  digits  (or  bits),  rtumerlcal  difficulties  may  be  expected  as 
N N 

k(A)  approaches  10'  (or  2 ).  For  instance,  if  the  maximum  and  minimum 


numbers  of  Interest,  o and  a . , were  100000  and  000001  (in  base  10 
’ max  min’ 

or  2) , then  to  add  these  values  together  and  obtain  100001  without  nu- 
merical difficulties  would  require  at  least  6 significant  figures  (digits 
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or  bits) . But 


k(P)  = k(^^)  - [k(^)]2  (110) 

Therefore,  while  numerical  operations  on  the  covariance  ^ may  encounter 

N N 

difficulties  when  k(]P)  ~ 10  (or  2 ) , those  same  numerical  problems 
would  arise  when  k(^)  ••  10*^^^  (or  2^^^)  according  to  (110) : the  same 

numerical  precision  is  achieved  with  half  the  wordlength. 


EXAMPLE 


This  example  and  the  next  illustrate  the  improved  numerical 
conditioning  of  the  square  root  filters.  To  simulate  roundoff,  let 
e « 1 be  such  that 


1 + e ? 1 
1 + e2  i 1 

where  means  equal  due  to  rounding.  Consider  a scalar  measurement 
update  of  a two-state  problem,  with 

P(tp  * [J  j],  H(t^)  - [1  0),  R(t^)  - e2 


and  compare  the  computed  results  of  conventional  filters  and  those  of  this 
report.  Note  that 


P(tp  - P-l(t-)  - S(tp  - S-^Ctj)  - D(t-)  - D(t-)  - I 


and  that  the  exact  covariance  PCf^)  for  this  example  is: 


The  computed  results  are: 

(a)  Conventional  Kalman 

(b)  Joseph  Form  Kalman 

(c)  Potter  Covariance  Square  Root 

(d)  Carlson  Covariance  Square  Root 


(e) 

Inverse  Covariance 

(f) 

Inverse  Covariance  Square  Root 

r^(tp  E K; 

(g) 

U-D  Factor 

- [; ;] 

[f;] 

0 

1 


For  this  example,  all  but  the  conventional  Kalman  filter  yield  non- 
singular and  nearly  exact  answers.  Although  the  difference  between  0 
and  e^  in  the  upper  left  element  of  P(t^)  may  seem  insignificant,  it 
can  have  grave  consequences.  For  Instance,  assume  no  dynamics  and  let 
a second  measurement  of  the  same  form  be  taken.  The  gain  K computed  by 
the  conventional  Kalman  filter  would  be 

K(t^)  - P(t+)  H(t^)/  [H(t^)P(t^HT(t^)  + R(t^)] 

|/[0  + 1]  - 


whereas  the  correct  value  is 

K(t^)  = -si— 
1 + 


as  would  be  calculated  correctly  by  the  Joseph  form  in  this  case.  ■ 
EXAMPLE 

Consider  the  same  problem  as  in  the  previous  example,  but  let  H(ti) 
be  [1  1]  instead  of  [1  0].  In  this  case,  the  exact  answer  is 


1 ri  + e2  -1  1 
-I 


The  computed  results  are: 


(a)  Conventional  Kalman  r F 1 -l] 

P(t+)  - ^ jJ 

-i] 


(b)  Joseph  Form  Kalman 
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0 


(c)  Potter  Covariance  Square  Root 

(d)  Carlson  Covariance  Square  Root 

(e)  Inverse  Covariance 

(f)  Inverse  Covariance  Square  Root 

(g)  U-D  Factor 

In  this  case,  only  the  square  root  and  U-D  Implementations  yield  non- 
singular results.  Such  singular  or  matrices  would  again 

yield  a zero  gain  K If  a second  measurement  of  the  same  form  were  processed 
while  the  square  root  and  U-D  factor  filters  compute  a gain  which  is  nearly 
exact.  Moreover,  even  though  U,  and  ^ are  non-singular,  the 

associated  value  of  P.(t^)  or  (t^)  found  by  multiplication  would  be 

rounded  to  a singular  matrix;  thus  It  Is  better  not  to  perform  such 
computations  explicitly,  and  the  time  propagations  based  on  trlangular- 
ization  are  to  be  preferred  over  the  RSS  method  which  performs  such 
multiplication.  B 

The  improved  numerical  characteristics  of  the  square  root  and  U-D 
factorization  filters  are  achieved  at  the  expense  of  Increased  computa- 
tional burden.  Letting  n be  the  dimension  of  the  state  vector  ^ , s be 
the  dimension  of  the  dynamic  driving  noise  w,  and  m be  the  dimension  of 
the  measurement  _z  and  its  corruptive  noise  v^,  we  now  determine  the  number 
of  mathematical  operations  required  by  the  various  filters,  assuming  that: 

1)  all  implementations  take  advantage  of  symmetry  and  zeros  as 
they  appear  in  general  forms 

2)  R(tj^)  and  are  diagonal 

3)  the  inverse  covariance  and  inverse  covariance  square  toot  fil- 
ters generate  explicit  state  estimates,  x(ti)  and  x(ti). 
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Table  1 presents  the  number  of  operations  for  one  time  propagation  and 
one  measurement  update  required  by 


1.  Kalman  Filter  - with  conventional  and  Joseph  form  measurement 
update 

2.  Potter  Covariance  Square  Root  Filter  - with  MGS  and  Householder 
time  propagations 

3.  Carlson  Covariance  Square  Root  Filter  - with  Matrix  RSS  and  ' 

MGS  time  propagations  j 

4.  Inverse  Covariance  Filter  ] 

I 

5.  Inverse  Covariance  Square  Root  filter  (MGS  update)  j 

6.  U-D  Covariance  Factorization  Filter.  I 

t 

This  table  can  be  used  to  project  the  computation  time  required  by  each  ] 

filter  formulation  for  a given  application.  Note  that  if,  instead  of 
assuming  to  be  diagonal,  we  were  to  assume  that  the  n-by-n 

or  the  n-by-s  were  known,  there  would  | 

be  ^ ns  (n  + 3)  fewer  multiplies  and  k n (n  + 1) (s  - 1)  fewer  adds  in 
filter  forms  1,  3 with  RSS  time  propagation,  and  4,  or  ns  fewer  multiplies 

J 

in  forms  2 and  3 with  MGS  time  propagation.  i 

EXAMPLE  ' 

To  put  the  algebraic  expressions  of  Table  1 into  perspective. 

Table  2 presents  the  number  of  operations  required  for  one  time  propa-  ; 

gatlon  and  one  measurement  update  for  the  case  of  n ~ 10,  s ° 10,  and  | 

m » 2.  The  noise  dimension  s was  intentionally  set  equal  to  n to  cor- 
respond to  the  n-by-n  [Gj(t.)Qj(t  )GT(t  )]  being  of  full  rank,  typical 
of  an  equivalent  discrete-time  model?  The  last  column  in  Table  2 por- 
trays computer  time  required  for  one  total  filter  recursion,  neglecting 
the  computations  associated  with  the  various  subscripting  and  storage 

operations  for  each  filter  (roughly  the  same  for  each) , and  using  single  | 

precision  instruction  times  typical  of  the  IBM  360  and  some  smaller  state- 

of-the-art  computers:  | 

time  for  addition  “ 2.7  psec 

time  for  multiplication  = 4.1  psec 
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Table  1 Operations  Required  for  One  Time  Propagation 

and  One  Measurement  Update 


FILTER 

i ADDS  . 

! (all  times  ^'6) 

1 

1 

MULTIPLIES 
(all  times  ^'6) 

DIVIDES 

SQUARE 

ROOTS 

Conventional 

Kalman 

1 

j 9n^+3n^(3mfs-l) 
+n(15m4-38-6) 

9n3+3n^ ( 3m+8+3 ) 
+n(27iiri-9s) 

m 

0 

Joseph  Form 
Kalman 

18n^+3n2 (Smfs-lO) 

+n  (9m^+6nH-3s  ) 
+3m^-6m^3m 

18n3+3n^  (5m+s+4) 

+n  ( 9m^24mf  9s  ) 
+3m3+9m^-6m 

2m-l 

0 

Potter  Cov  V" 
(MGS)  ! 

1 

12n3+3n2(6mf2s) 

1 +n(6m-6)+6m 

12n3+3n2(6nri-28+2) 

+n(24nri-68)+12m 

n+2m 

n+m 

Potter  Cov  \ 

(Householder)  | 

i 10nMn^(6m«8-l) 
+n(6in+5)+6m 

10n3+3n2(6mf2s+2) 
+n  (24iiri-68+8)+12m 

n+2m 

r+m 

Carlson  Cov  j 

(RSS)  i 

i 

1 

5n^+3n^ (3mf  s+1) 

+n  (9nri-38-  14)+2s  3+48 

5n3+3n2  (4iiri-s+3) 
+nl(30m+98-2)+2s3 
+68^-28 

2mn+8 

mn+s 

Carlson  Cov  V 
(MGS) 

1 

9n3+3n^ (3m+s-l) 

\ +3n(3'!iH-3s-8)+2s3 
j +6s^48 

1 

9n3+3n^  (4iiri-8+2) 

+3n(10nH-58-7)+283 

+128^4s 

2mn+s 

mrH-s 

Inverse 

Covariance 

! 10n3+3n^(m+38+2) 

1 +n(9m+9s-16) 

10n3+3n^  (iirt-38+6) 
+n(15DH-2l8-10) 

2s-l 

0 

Inverse 

Cov  V" 

9n3+3n2(2m4-68+5) 

+n(12m+68-6) 

9n3+3n2(2iiri-68+6) 

+n(12m+248+3)+6s 

2nf2s 

n+s 

U-D 

Factor 

9n3+3n2(3m+28+2) 

+3n(3Bi+l) 

9n3+3n2(3m-2s+7) 

+3n(mf4s-4)-6s 

n(nri-l)-l 

0 
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Table  2 Operations  for  One  Total  Filter  Recursion; 

n **  s - 10  and  m •=  2 


FILTER 

ADDS 

MULTIPLIES 

DIVIDES 

SQUARE 

ROOTS 

TIME 

(Milllsec) 

Conventional 

Kalman 

2340 

i 

i 

2690 

2 

0 

17.36 

Joseph  Form 
Kalman 

3631 

1 

1 

4498 

3 

0 

28.27 

i 

Potter  Cov  V 
(MGS) 

I 3612 

1 

3884 

14 

12 

26l.49 

Potter  Cov  \/” 
(Householder) 

3247 

3564 

14 

12 

24.19 

i 

Carlson  Cov  V 
(RSS) 

2080 

2560 

50 

30 

18.24 

Carlson  Cov  V" 
(MGS) 

2830 

3355 

50 

30  i 

23.53 

Inverse  ! 

Covariance  i 

3520 

3950 

19 

0 

25.82 

Inverse 

Cov  V" 

5080 

5455 

40 

20 

37.55 

1 

0-D 

Factor 

2935 

3330 

29 

0 

1 

1 21.77 

time  for  division  * 6,6  jisec 

time  for  square  root  « 60,0  ysec 

As  can  be  seen  from  Table  2,  the  covariance  square  root  filters 
and  the  D-D  covariance  factorization  filter  involve  a computational  load 
greater  than  the  conventional  Kalman  filter,  but  not  so  great  as  to  be 
prohibitive.  In  fact,  the  Increase  is  less  than  that  caused  by  employing 
the  Joseph  form  of  the  update  equation,  which  is  inferior  to  these  fil- 
ters in  performance.  Moreover,  since  the  Kalman  filter  would  probably 
require  double  precision  operations  instead  of  the  single  precision 
assumed  to  establish  Table  2,  these  filters  are  even  more  competitive 
with  the  Kalman  filter  than  indicated  in  the  table. 

Of  the  square  root  type  filters,  the  Carlson  covariance  square  root 
and  the  D^D  covariance  factorization  filters  are  the  most  efficient  com- 
putationally. The  Carlson  filter  with  Matrix  RSS  time  propagations 
requires  the  least  computer  time,  but  this  is  offset  by  the  degraded 
numerical  accuracy  of  the  Matrix  RSS  method.  Thus,  the  U-D  covariance 
factorization  filter  would  appear  to  be  an  exceptionally  efficient  and 
numerically  advantageous  alternative  to  the  conventional  Kalman  filter 
for  this  particular  application.  B 


10.  Conclusion 


This  report-  presented  the  concept  of  square  root  filters  and  the 
closely  related  U-D  covariance  factorization  filter  as  viable  alternatives 
to  conventional  Kalman  filters.  For  a modest  increase  in  computational 
loading,  one  obtains  optimal  ^Iter  algorithms  equivalent  to  the  Kalman 
filter  if  infinite  wordlength  is  assumed,  but  with  vastly  superior  numerical 
characteristics  with  finite  wordlength.  From  a numerical  analysis  standpoint, 
this  is  at  least  as  good  a solution  to  troublesome  measurement  update  com- 
putations as  implementing  a Kalman  filter  in  double  precision,  since  the 
Kalman  filter  inherently  Involves  unstable  numerics. 

Of  the  covariance  square  root  forms,  the  Carlson  filter  is  more 
efficient  than  the  Potter  form  computationally,  and  it  also  maintains 
triangularity  of  the  square  root  matrices.  The  U-D  covariance  factori- 
zation filter  is  comparable  to  the  Carlson  filter  and  does  not  require 
square  root  computations.  In  comparison,  the  inverse  covariance  square 
root  filter  is  often  considerably  more  burdensome  computationally,  althou^ 
it  too  becomes  competitive  if  the  measurement  dimension  m is  very  large. 

Chandrasekhar  type  square  root  algorithms  have  also  been  reported 
in  the  literature.  However,  these  have  been  omitted  because  they  do  not 
appear  to  be  computationally  competiti-ve  with  algorithms  presented  herein 
for  the  nonstationary  linear  discrete-time  estimation  problem. 


! 

58 


k . J 


BIBLIOGRAPHY 


f 


\ 

I 

I 

t 

fc 

r 


1.  Agee,  W.  S,,  and  R.  U.  Turner,  "Triangular  Decotnposition  of 

a Positive  Definite  Matrix  Plus  a S^nainetrlc  Dyad  vflth  Applications 
to  Kalman  Filtering,"  Mathematical  Services  Branch,  Analysis  and 
Computation  Division,  Tech.  Report  38,  White  Sands  Missile  P.ange, 

1972. 

2.  Andrews,  A.,  "A  Square  Root  Formulation  of  the  Kalman  Covariance 
Equations,"  AIAA  Journal.  Vol.  6,  No.  6,  June  1968,  pp.  1165-1166. 

3.  Bellantoni,  J.  F.,  and  K.  W.  Dodge,  "A  Square  Root  Fonaulatlon  of 
the  Kalman  - Schmidt  Filter,"  AIAA  Journal.  Vol.  5,  No.  7, 

July  1967,  pp.  1309-1314. 

4.  Blerman,  G.  J.,  "A  Comparison  of  Discrete  Linear  Filtering 

Algorithms,"  IEEE  Transactions  on  Aerospace  and  Electronic  Systems. 
Vol.  AEr.-9,  No.  1,  Jan  1973,  pp  28-37.  ‘ 

5.  Blerman,  G.  J.,  "Sequential  Square  Root  Filtering  and  .'imoothlng 

of  Discrete  Linear  Systems,"  Automatlca.  Vol.  10,  1974.  pp.  147-158. 

6.  Blerman,  G.  J.,  "Measurement  Updating  Using  the  U-D  Factorization," 
Proceedings  of  the  1975  IEEE  Control  and  Decision  Conference. 

Houston,  Texas,  Dec  1975,  pp.  337-346. 

7.  BieTanan,  G.  J.,  Factorization  Methods  for  Discrete  Sequential 
Estimation.  Academic  Press,  New  York,  N.  Y.,  1^76. 

8.  Bjorck,  A.,  "Solving  Linear  Least  Squares  Problems  by  Gram- 
Schmidt  Orthogonallzation, " BIT.  Vol.  7,  1967,  pp  1-21. 

9.  Buslnger,  P.,  and  G.  H.  Golub,  "Linear  Least  Squares  Solution 
by  Householder  Transformations,"  Numerische  Mathematik. 

Vol.  7,  1965,  pp.  269-276. 

10.  Carlson,  N.  A.,  "Fast  Triangular  Formulation  of  the  Square 

Root  Filter,"  AIAA  Journal.  Vol.  11,  No.  9,  Sept  1973,  pp.  1259-1265. 

11.  Dyer,  P.,  and  S.  McReynolds,  "Extension  of  Square-Root  Filtering 
to  Include  Process  Noise,"  Journal  of  Optimization  Tlieory 

and  Applications.  Vol.  3,  No.  6,  1969,  pp.  444-459. 

12.  Fadeeva,  V.  N.,  Computational  Methods  of  Linear  Algebra. 

Dover  Publications,  Inc.,  New  York,  N.  Y.,  1959. 

13.  Fletcher,  R. , and  M.  J.  D.  Powell,  "On  the  Modification 
of  LDL^  Factorizations,"  Mathematics  of  Computation. 

Vol.  28,  No.  128,  Oct  1974,  pp.  1067-1087. 


59 


r 


14.  Ccntlenan,  V«,  M. , "'Least  Squares  Coinputs tions  by  Givens 
Trans fcrtiatloas  without  Snuare  Roots,"  J.  Inst.  ySjsth. 

Appllc..  Vol.  12,  197’,  pp.  329-336. 

15.  Gill,  P.  F,,  G.  H.  Golub,  V,  Murr.iy,  and  M.  A.  Saunders, 

"Methods  for  Modifying  Matrix  Factorizations,"  Mathematics 
of  Computation.  Vol.  28,  No.  126,  1974,  pp.  505-535. 

1 6.  Gill,  P.  E.,  W.  Murray,  and  M.  A.  Saunders,  "Methods  for 
Computing  and  Modifying  the  LDV  Factors  of  a Matrix,"  Mathematics 
of  Computation.  Vo].  29,  No.  132,  Oct  1975,  pp.  1051-10'77. 

17.  fiolub,  G.  H. , "Numerical  Methods  for  Solving  Linear  Least 

Squares  Problems,"  Numerlsc.he  Ma.thematik . Vol.  7,  1965,  pp.  206-216. 

18.  Hanson,  R.  J.,  and  C.  L.  Lawson,  "Extensions  and  Applications  of 
the  Househo-Tdar  Algorithm  for  Solving  Linear  Least  Squares 
Problems,"  Mathematics  of  Computation,  Vol.  23,  No.  108,  Oct  1969, 

po.  787-812’. 

19.  Householder,  A.  S.,  The  Theory  of  Matrices  in  Numerical  Analysis. 
Blaisdell  Publishing  Co.,  Waltham,  Mass.,  1964. 

20.  Jordan,  T. , "Experiments  on  Error  Growth  Associated  with 

Some  Linear  Least  Squares  Procedures,"  Mathematics  of  Comoutatlon, 
Vol.  22,  1968,  pp.  579-588. 

21.  Kalman,  R.  E.,  "A  New  Approach  to  Linear  Filtering  and 
Prediction  Problems,"  Transactions  of  the  ASME,  Vol.  82D,  Mar  1960, 

pp.  35-50. 

22.  Kaminski,  F.  G. , A.  E.  Bryson  Jr.,  and  S.  F.  Schmidt, 

"Discrete  Square  Root  Fllterlcg:  A Survey  of  Current  Techniques," 
IEEE  Transactions  on  Automatic  Control.  Vol.  AC-16,  No.  6,  Dec  1971, 
pp.  727-735. 

23.  Lawson,  C.  L. , and  R.  J.  Hanson,  Solving  Linear  Least  Squares 
Problems.  Prentice-Hall,  Inc.,  Englewood  Cliffs,  N.J.,  1974. 

24.  Maybeck,  P.  S.,  "The  Kalman  Filter:  An  Introduction  for  Potential 
Users,"  Air  Force  Flight  Dynamics  Laboratory'  Report  TM-72-3, 

Wright  Patterson  AFB,  OH,  June  1972. 

25.  Maybeck,  P.  S.,  Stochastic  Estimation  and  Control  Systems. 

Air  Force  Institute  of  Technology,  Wrlght-Patterson  .\FB,  OH, 

Feb  1975. 


26.  Korf,  K.,  a.ai  T.  .Cailath,  ’'Squ'<ce-iloot  Aii,,-"’ r tlaie  for  Leaut- 
Cquaras  Estimation,"  ISiSE  Traisactions  oa  AuLoni-c.'' c Ccritrol. 

Vol.  AC-20,  ITo.  4,  Au{;  1J75,  pp.  487-497. 

27.  Potto'r,  J.  Ti.,  iiatrlvc  AujiHi-nitatlon,''  I-UT  T.U3t;rirnentct:ic-T. 
:.£c«criitory  Kamr-  "iGa  5-64,  Jan  1964. 

28.  Practical  i^sj.acts  of  Kalm&a  Filter tn«  Istplette.itatioii. 

NATO  Advisory  Group  for  Aerospace  Research  and  JJavHlopitent, 
AGABE-LS-82,  London,  Mar  1976. 

29.  Klee,  J.  ?..,  "Experfments  on  Gram-Schnidt  Crthogcuallzatlon," 
Mathematics  of  Comnutation,  Vol.  20,  1S66,  pp.  325-328. 

30.  Schmidt,  S.  F.,  "Compiitetional  Techniques  in  KalKian  rij«eting," 
Chapter  3 in  Theory  and  Applications  of  Kalman  FilteriaA, 
AGARDograph  129,  London,  Feb  1970. 

31.  Tlieory  and  Ap.:  11  cations  of  Kalman  Filtering,  NAjX)  .Advisory 
Grouo  for  Aerospace  Reisearch  and  development,  AGARDograph  139, 
London,  Feb  1970. 

32.  Thornton,  C.  L.  , and  G.  J.  Biermsn,  "Gi'air.-Schirldt  Algorithms  for 
Covariance  Propagation,"  Proceedings  of  the  1975  IEEE  Control 
and  Decision  Conference.  Kouaton,  Texas,  Dec  1975,  pp.  489-498. 

33.  Thornton,  C.  L. , and  G.  J.  Bierman,  "A  Numerical  Conqjarison 

of  Discrete  Kalman  Filtering  Algorithms:  An  Orbit  Determination 
Case  Study,"  preprint  of  paper  submitted  for  publication. 

34.  Wampler,  R.  H.,  "A  Report  on  the  Accuracy  of  Some  Widely 
I'sed  Least  Squares  Computer  Programs ,"  Journal  of  the 
American  Statistical  Association,  VqL.  65,  No.  330,  1970, 
pp.  549-565. 

35.  Wilkinson,  J.  H. , Rounding  Errors  in  JJgebraic  Processes. 
Prentlce-Kall,  Inc.,  Englewood  Cliffs,  N.J.jlIfS. 


61 


secuniTv  classification  of  this  pace  dm*  «/»*»</; 


REPORT  DOCUMENTATION  PAGE 


EPONT  number 


READ  INSTRUCTIONS 
BEFORE  COMPLETING  FORM 


2.  GOVT  ACCESSION  NO.I  3-  RECIPIENT'S  CATALOG  NUMBER 


$.  TYPE  OF  REPORT  A PERIOD  COVERED 


^ Peter  S./Maybeck  J 
ABaoetacr^njteggw  of 
Electrical  Engineering 


S.  PERFORMING  ORGANIZATION  NAME  AND  ADDRESS 

Air  Force  Institute  of  Technology  (AFIT-EKG) 
Wrlght-Patterson  AFB,  OH  45433 


8.  PERFORMI 


8.  CONTRACT  OR  GRANT  NUMBERCi; 


10.  PROGRAM  ELEMENT.  PROJECT.  TASK 
AREA  A WORK  UNIT  NUMBERS 


n.  CONTROLLING  OFFICE  NAME  AND  ADDRESS 

Air  Force  Avionics  Laboratory 
Wrlght-Patterson  AFB,  OH  45433 


«.  MONITORING  AGENCY  NAME  A ADORESS^If  different  from  Controlling  Office) 

15.  SECURITY 

Unclassified 

IS*.  DECLASSIFICATION/ downgrading 

schedule 

<8.  distribution  STATEMENT  (of  Ihit  Koporfj 

Approved  for  Public  Release;  distribution  unlimited. 


I *7.  distribution  statement  (of  Iho  obolroct  oniorod  In  Slock  20,  If  dlffotonl  from  Report) 


IS.  SUPPLEMENTARY  NOTES 


9^; 


19.  KEY  WORDS  (Conflrtuo  on  re 

Kalman  Filter 

Square-Root  Filter 

U-D  Covariance  Factorization  Filter 

Wordlength 


ector  of  Information 


00  0id0  it  n0C00000y  0nd  id0ntify  by  block  nuatb0r) 

Algorithms 

Filter 

jrlzatlon  Filter  Estimation 

Optimal  Filter 


umerlcs 


I.  ABSTRACT  fConllmM  on  roi^oroo  oldo  It  nocooomry  ond  Idontify  by  block  numbot) 

This  report  presents  the  concept  of  square  root  filters  and  the  closely 
related  U-D  covariance  factorization  filter  as  viable  alternatives  to  conven- 
tional Kalman  filters.  For  a modest  increase  in  computational  loading,  one 
obtains  optimal  filter  algorithms  equivalent  to  the  Kalman  filter  if  infinite 
wordlength  is  assumed,  but  with  vastly  superior  numerical  characteristics  with 
finite  wordlength.  These  algorithms  are  at  least  as  good  a solution  to 
troublesome  measurement  update  computations  as  implementing  a Kalman  filter  in 
double  precision,  since  the  Kalman  filter  inherently  inv 


nn  ^ORM 
W 1 JAN  7» 


EDITION  OF  I NOV  88  IS  OBSOLETE 


unclassified 

SECURITY  CLASSIFICATION  OF  ' 


HIS  PAGE  (WhenjOetm  Entered) 

X--e— 


UNCLASSIFIED 

SeCUWITy  CLASSIFIC/kTIOW  OF  This  P»GEf<Wl«n  Dmim  Bnimd) 


20.  continued 

The  filter  algorithms  are  developed  and  presented  In  a form  convenient  for 
Implementation.  Of  the  covariance  square  root  forms,  the  Carlson  filter  Is 
more  efficient  than  the  Potter  form  computationally,  and  it  also  maintains 
triangularity  of  the  square  root  matrices.  The  U-D  covariance  factorization 
filter  is  even  more  efficient,  not  requiring  square  root  computations.  In 
comparison,  the  Inverse  covariance  square  root  filter  is  often  considerably 
more  burdensome,  although  It  too  becomes  competitive  if  the  measurement 
dimension  is  very  large. 


